What's this?
(only Linux!)


perf on Linux is one of my favourite
debugging tools. It lets you:


* trace system calls faster than strace 

* profile your C, Go, C++, node.js, Rust, and Java/JVM
programs really easily

* trace or count almost *any* kernel event
(e.g. "perf, count how many packets every program sends")


I've even used it more than once to profile Ruby programs,
so it's not just for systems wizards.


This zine explains both how to use the most
important perf subcommands, and a little bit
about how perf works under the hood.






let me show you my favourite
perf features + how I use it!





Julia Evans
@b0rk
https://jvns.ca


more perf resources


Thanks for reading! A few more useful resources:

Brendan Gregg's blog (brendangregg.com/perf.html) is my favourite perf resource.
His talks are also useful!






LWN (Linux Weekly News, LWN.net) is a great Linux publication, and
they sometimes publish articles about perf.


perf has man pages, as you'd expect.
"man perf top", for example.


most importantly:

* pick a program and try to profile it!
* see what your kernel is doing under different workloads!
* try recording/counting a few kinds of perf events and
  see what happens!

Good luck & have fun!

perf top

My favourite place to start with perf is
"perf top".


top: "I know how much CPU every program is using"

perf top: "Well, I know how much CPU every function is using"


I like to run "perf top"
on machines when a program
is using 100% of the CPU
and I don't know why.





As an example, let's profile a really simple
program I wrote. It has a single function
(run_awesome_function) which is an infinite loop.

Here's the code I ran. I called the binary "use_cpu".

    void run_awesome_function() {
        int x = 0;
        while (1) {
            x = x + 1;
        }
    }

    int main() { run_awesome_function(); }


While that's running, start perf top. It usually needs
to run as root, like most perf subcommands.


$ sudo perf top 





perf: under the hood

It's often useful to have a basic understanding of
how our tools are implemented. So let's look at the
interface the userspace tool ("perf") uses to talk to
the Linux kernel. Here's what happens, basically:

(1) perf calls the perf_event_open system call
(2) the kernel writes "events" to a ring buffer
    in user space
(3) perf reads events off that ring buffer and
    displays them to you somehow


What's a ring buffer?
Basically, it's important to use a limited amount
of memory for profiling events. So the kernel allocates
a fixed amount of memory:

[ ][ ][ ][ ][ ][ ][ ][ ]
each of these is space for 1 record

And when that memory gets full (because
new records are being written faster than perf can
read them)...

Linux: "whoops! we're out of space, guess I
can't write more events!"

So if you see warnings from perf about events
being dropped, that's what's happening.

perf record

perf top is great for getting a quick idea of what's
happening, but I often want to investigate more in depth.

perf record collects the same information as perf top,
but it lets you save the data to analyse later.

It saves it in a file called "perf.data"
in your current directory.






Linux: "hey, here's some profiling data!"

perf record: "I'll save it in a file called perf.data"


There are 3 main ways to choose what process(es) to
profile with perf record:

(1) perf record COMMAND: start COMMAND and profile it until it exits
(2) perf record -p PID: profile PID until you press ctrl+c
(3) perf record -a: profile every process until you press ctrl+c

There's a 4th hybrid thing you can do: if you specify
both a PID (or -a) and a command, it'll profile the PID
until the command exits. Like this:

    perf record -p 8325 sleep 5
    (PID: 8325, COMMAND: sleep 5)

This useful trick lets you profile PID 8325 for 5 seconds!


how profiling with perf works


The Linux Kernel has a built in sampling profiler: 


Linux: "I checked what function the program
was running 50,000 times and here
are the results!"


How does Linux know which functions your program is
running though? Well, the Linux kernel is in charge of
scheduling.

That means that at all times it has a list of every process
and the address of the CPU instruction that process is
currently running. That address is called the instruction pointer.


Here's what the information the Linux kernel has looks like: 


Command   PID    Thread ID   instruction pointer
python    2374   2274        0x00759d2d
bash      1229   1229        0x00123456
use_cpu   4441   4441        0xabababab
use_cpu   4441   4991        0xababbbbb


Sometimes perf can't figure out how to turn an instruction
pointer address into a function name. Here's an example of
what that looks like:

0.00%  nodejs           nodejs             [.] 0x0000000000759d20   <- mysterious address!
0.00%  V8 WorkerThread  [kernel.kallsyms]  [k] hrtimer_active
analyzing perf record data

3 ways to analyze a "perf.data" file generated by
perf record:


perf report: quick interactive report showing
you which functions are used the most

100.00%    0.00%  use_cpu  use_cpu       [.] main
100.00%    0.00%  use_cpu  libc-2.23.so  [.] __libc_start_main
100.00%  100.00%  use_cpu  use_cpu       [.] run_awesome_function

100% of the time is spent in this function!


perf annotate will tell you which
assembly instructions your program
is spending most of its time
executing (be careful, can be
off by one instruction)

Percent | Source code & Disassembly of kcore for cycles:pp

Disassembly of section .text:

00000000004004d6 <run_awesome_function>:
run_awesome_function():
  0.00 :   4004d6:  push   %rbp
  0.00 :   4004d7:  mov    %rsp,%rbp
  0.00 :   4004da:  movl   $0x0,-0x4(%rbp)
100.00 :   4004e1:  addl   $0x1,-0x4(%rbp)    <- this is where the time is being spent
  0.00 :   4004e5:  jmp    4004e1 <run_awesome_function+0xb>


perf script prints out all the
samples perf collected as text so
you can run scripts on the output
to do analysis. Like the flamegraph
script on the next page =>

use_cpu 23001 19774.727477:     349732 cycles:pp:
            4e1 run_awesome_function (/home/bork/work/perf-zine/use_cpu)    <- stack trace, with symbols
            4f5 main (/home/bork/work/perf-zine/use_cpu)
          20830 __libc_start_main (/lib/x86_64-linux-gnu/libc-2.23.so)
8fe258d4c544155 [unknown] ([unknown])

how perf works: overview

Now that we know how to use perf, let's see
how it works!
The perf system is split into 2 parts:

(1) a program in userspace called "perf"
(2) a system in the Linux kernel

When you run "perf record", "perf stat", or "perf top"
to get information about a program, here's what happens:

* perf asks the kernel to collect information

  perf, to the Linux kernel: "profile this program! collect system calls! count network packets!"

* the kernel gets samples / traces / CPU counters
  from the programs perf asks about.

* perf displays the data back to you in a
  (hopefully) useful way.


So here's the big picture:

perf (userspace program) <-> Linux kernel <-> programs I'm analyzing

perf + node.js / Java

Normally with interpreted languages like node.js, perf
will tell you which interpreter function is running but not which
JavaScript function is running. But we can help tell perf
which function is running!

This works because both node and Java have a JIT
("just in time") compiler.
function my_cool_fun() {
    // do a thing
}

node.js: "you know, I'm actually going to
just-in-time compile that to machine code"

JIT compiled instructions

node.js: "hey, those instructions
correspond to the my_cool_fun function"


node communicates with perf by writing a file called
/tmp/perf-$PID.map


How to set this up:

node.js:
    node --perf-basic-prof program.js

Java:
(1) get perf-map-agent from github
(2) find PID of process
(3) create-java-perf-map.sh $PID

perf stat: count any event

You can actually count lots of different events with
perf stat. The same events you can record with perf record!

Here are a couple examples of using "perf stat" on
ls -R (which lists files recursively, so makes lots of syscalls)


(1) count context switches (the kernel switching
    a CPU from one task to another)!
$ sudo perf stat -e context-switches ls -R /
Performance counter stats for 'ls -R /':
    20,821 context-switches


(2) count system calls!

$ sudo perf stat -e 'syscalls:sys_enter_*' ls -R / > /dev/null     <- the * is a wildcard
      8,028 syscalls:sys_enter_newlstat
     15,167 syscalls:sys_enter_write
    254,755 syscalls:sys_enter_close
    254,777 syscalls:sys_enter_open
    509,496 syscalls:sys_enter_newfstat
    509,598 syscalls:sys_enter_getdents     <- directory entries


perf stat does introduce some overhead. Counting every
system call for "find" made the program run up to
2.6 times slower in my brief experiments.

I think as long as you only count a few different
events (like just the syscalls:sys_enter_open event)
it should be fine. I don't 100% understand why there's
so much overhead here, though.

* perf cheat sheet *

important command line arguments:

what data to get?
-F: pick sample frequency
-g: record stack traces
-e: choose events to record


* perf top: get updates live! *

# Sample CPUs at 49 Hertz, show top symbols:
perf top -F 49

# Sample CPUs, show top process names and segments:
perf top -ns comm,dso

# Count system calls by process, refreshing every 1 second:
perf top -e raw_syscalls:sys_enter -ns comm -d 1

# Count sent network packets by process, rolling output:
stdbuf -oL perf top -e net:net_dev_xmit -ns comm | strings


what program(s) to look at?
-a: entire system
-p: specify a PID
COMMAND: run this cmd


* perf stat: count events, CPU counters *

# CPU counter statistics for COMMAND:
perf stat COMMAND

# *Detailed* CPU counter statistics for COMMAND:
perf stat -ddd COMMAND

# Various basic CPU statistics, system wide:
perf stat -e cycles,instructions,cache-misses -a

# Count system calls for PID, until Ctrl-C:
perf stat -e 'syscalls:sys_enter_*' -p PID

# Count block device I/O events for the entire system, for 10 seconds:
perf stat -e 'block:*' -a sleep 10

* Reporting *

# Show perf.data in an ncurses browser:
perf report

# Show perf.data as a text report:
perf report --stdio

# List all events from perf.data:
perf script

# Annotate assembly instructions from perf.data
# with percentages
perf annotate [--stdio]


Need kernel debuginfo 


sourced from brendangregg.com/perf.html,
which has many more great examples


* perf trace: trace system calls & other events *


# Trace syscalls for PID 
perf trace -p PID 


# Trace syscalls system-wide 
perf trace 

* perf record: record profiling data *
(records into perf.data file)

# Sample CPU functions for COMMAND, at 99 Hertz:
perf record -F 99 COMMAND

# Sample CPU functions for PID, until Ctrl-C:
perf record -p PID

# Sample CPU functions for PID, for 10 seconds:
perf record -p PID sleep 10

# Sample CPU stack traces for PID, for 10 seconds:
perf record -p PID -g -- sleep 10

# Sample CPU stack traces for PID, using DWARF to unwind stack:
perf record -p PID --call-graph dwarf


* perf record: record tracing data *
(records into perf.data file)

# Trace new processes, until Ctrl-C:
perf record -e sched:sched_process_exec -a

# Trace all context-switches, until Ctrl-C:
perf record -e context-switches -a

# Trace all context-switches with stack traces, for 10 seconds:
perf record -e context-switches -ag -- sleep 10

# Trace all page faults with stack traces, until Ctrl-C:
perf record -e page-faults -ag

* adding new trace events *

# Add a tracepoint for kernel function tcp_sendmsg():
perf probe 'tcp_sendmsg'

# Trace previously created probe:
perf record -e probe:tcp_sendmsg -a

# Add a tracepoint for myfunc() return, and include the retval as a string:
perf probe 'myfunc%return +0($retval):string'

# Trace previous probe when size > 0, and state is not TCP_ESTABLISHED(1):
perf record -e probe:tcp_sendmsg --filter 'size > 0 && skc_state != 1' -a

# Add a tracepoint for do_sys_open() with the filename as a string:
perf probe 'do_sys_open filename:string'


